Combining Knowledge and Data Driven Insights for Identifying Risk Factors using Electronic Health Records

نویسندگان

  • Jimeng Sun
  • Jianying Hu
  • Dijun Luo
  • Marianthi Markatou
  • Fei Wang
  • Shahram Ebadollahi
  • Zahra Daar
  • Walter F. Stewart
چکیده

BACKGROUND The ability to identify the risk factors related to an adverse condition, e.g., heart failures (HF) diagnosis, is very important for improving care quality and reducing cost. Existing approaches for risk factor identification are either knowledge driven (from guidelines or literatures) or data driven (from observational data). No existing method provides a model to effectively combine expert knowledge with data driven insight for risk factor identification. METHODS We present a systematic approach to enhance known knowledge-based risk factors with additional potential risk factors derived from data. The core of our approach is a sparse regression model with regularization terms that correspond to both knowledge and data driven risk factors. RESULTS The approach is validated using a large dataset containing 4,644 heart failure cases and 45,981 controls. The outpatient electronic health records (EHRs) for these patients include diagnosis, medication, lab results from 2003-2010. We demonstrate that the proposed method can identify complementary risk factors that are not in the existing known factors and can better predict the onset of HF. We quantitatively compare different sets of risk factors in the context of predicting onset of HF using the performance metric, the Area Under the ROC Curve (AUC). The combined risk factors between knowledge and data significantly outperform knowledge-based risk factors alone. Furthermore, those additional risk factors are confirmed to be clinically meaningful by a cardiologist. CONCLUSION We present a systematic framework for combining knowledge and data driven insights for risk factor identification. We demonstrate the power of this framework in the context of predicting onset of HF, where our approach can successfully identify intuitive and predictive risk factors beyond a set of known HF risk factors.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Identification of Effective Factors related to Implementation of Electronic Health Records in Imam Khomeini Hospital, Tehran

Background: With the advancement of science and emergence of new technologies for solving human health and medical problems, one of the most important applications of technology in the field of health care is creation of electronic health records. The purpose of this study was to determine the effective internal and external factors related to successful implementation of the electronic health ...

متن کامل

The Content and Structure of Electronic Personal Health Records: A Systematic Review

Introduction: The electronic Personal Health Record (ePHR) improves people’s awareness and care management and leads to health promotion. One of the most important factors that contributes to the development of ePHR is identifying and understanding its content and structure. No comprehensive studies have so far been performed on the content and structure of ePHRs. Therefore, the purpose of this...

متن کامل

The Content and Structure of Electronic Personal Health Records: A Systematic Review

Introduction: The electronic Personal Health Record (ePHR) improves people’s awareness and care management and leads to health promotion. One of the most important factors that contributes to the development of ePHR is identifying and understanding its content and structure. No comprehensive studies have so far been performed on the content and structure of ePHRs. Therefore, the purpose of this...

متن کامل

Identifying and Ranking Development Drivers of Knowledge-based Technology-Driven Companies (Case study: Fars Province Science and Technology Park)

The purpose of this Study study is to identify and rank the development drivers of knowledge-based, technology-driven businesses. This work is conducted as a case study in Fars Province Science and Technology Park. It is a descriptive survey in terms of purpose since a part of its data is collected through questionnaires and is of surveying type because it describes the existing conditions. The...

متن کامل

Defining Disease Phenotypes in Primary Care Electronic Health Records by a Machine Learning Approach: A Case Study in Identifying Rheumatoid Arthritis

OBJECTIVES 1) To use data-driven method to examine clinical codes (risk factors) of a medical condition in primary care electronic health records (EHRs) that can accurately predict a diagnosis of the condition in secondary care EHRs. 2) To develop and validate a disease phenotyping algorithm for rheumatoid arthritis using primary care EHRs. METHODS This study linked routine primary and second...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium

دوره 2012  شماره 

صفحات  -

تاریخ انتشار 2012